Privacy-Awareness of Distributed Data Clustering Algorithms Revisited

Authors

  • Josenildo Costa da Silva
  • Matthias Klusch
  • Stefano Lodi
Abstract

Several privacy measures have been proposed in the privacy-preserving data mining literature. However, these measures either assume a centralized data source or assume that no insider will try to infer information. This paper presents distributed privacy measures that take into account collusion attacks and point-level breaches for distributed data clustering. An analysis of representative distributed data clustering algorithms shows that collusion is an important source of privacy issues and that the analyzed algorithms exhibit different vulnerabilities to collusion groups.

Similar resources

Entropy-based Consensus for Distributed Data Clustering

The increasingly large scale of available data and the more restrictive concerns about its privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...

Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well-researched constrained clustering algorithms is called microaggregation. In a microaggreg...

A Survey on Location Based Services in Data Mining

Data privacy has been a primary concern since distributed databases came into the picture. More than two parties have to combine their data for the data mining process without revealing it to the other parties. Continuous advancement in mobile networks and positioning technologies has created a strong challenge for location-based applications. Challenges resembling location-aware emergency respo...

Comparison of distributed evolutionary k-means clustering algorithms

Dealing with distributed data is one of the challenges for clustering, as most clustering techniques require the data to be centralized. One of them, k-means, has been elected as one of the most influential data mining algorithms for being simple, scalable, and easily modifiable to a variety of contexts and application domains. However, exact distributed versions of k-means are still sensitive ...

An Efficient Distributed Data Clustering Algorithm

The k-means algorithm is one of the most popular clustering algorithms in use today. The high running time complexity of serial k-means limits its applicability for very large databases. On the other hand, the existing parallel k-means algorithms demand huge data transfer operations, incurring high communication complexity. Transfer of actual data from local sites is also unacceptable, in man...

Publication date: 2016